Abstract

In vivo assembly of overlapping fragments by homologous recombination in 
Saccharomyces cerevisiae is a powerful method to engineer large DNA constructs. 
Whereas most in vivo assembly methods reported to date result in circular vec- 
tors, stable integrated constructs are often preferred for metabolic engineering 
as they are required for large-scale industrial application. The present study 
explores the potential of combining in vivo assembly of large, multigene expres- 
sion constructs with their targeted chromosomal integration in S. cerevisiae. 
Combined assembly and targeted integration of a ten-fragment 22-kb construct 
to a single chromosomal locus was successfully achieved in a single trans- 
formation process, but with low efficiency (5% of the analyzed transformants 
contained the correctly assembled construct). The meganuclease I-SceI was
therefore used to introduce a double-strand break at the targeted chromosomal
locus, thereby facilitating integration of the assembled construct. I-SceI-assisted
integration dramatically increased the efficiency of assembly and integration of
the same construct to 95%. This study paves the way for the fast, efficient, and
stable integration of large DNA constructs in S. cerevisiae chromosomes. 



Introduction 

The yeast Saccharomyces cerevisiae is intensively explored 
and applied as a platform for the industrial production of 
a wide range of endogenous and heterologous compounds. 
Its success in this role can be readily explained from its 
robustness, simple nutritional requirements, and easy 
genetic accessibility. This last feature has recently propelled 
the popularity of S. cerevisiae as a preferred platform in 
synthetic biology, especially for the assembly of large DNA 
constructs (Gibson et al, 2008). Improving the
performance of industrial organisms (higher productivity
and yield, increased robustness, expression of complex
heterologous pathways, etc.) by metabolic engineering
requires simultaneous expression of dozens of genes.
Handling such numbers of genes by classical cloning methods
is extremely time-consuming. Over the last decade, several
methods have been developed for fast and efficient
assembly of large DNA constructs. The most promising of these
are recombination-based methods in which multiple linear 
DNA fragments with overlapping terminal sequences are 
recombined into a single vector (Ellis et al, 2011). While 
in vitro recombination-based methods such as SLIC (Li & 
Elledge, 2007), InFusion (Benoit et al, 2006) and Gibson's 
isothermal assembly (Gibson et al, 2009) are undeniably 
valuable, in vivo recombination using S. cerevisiae is prov- 
ing to be the method of choice for large constructs assem- 
bled from many fragments (Gibson et al, 2008; Shao 
et al, 2009). This method can be used for efficient and 
accurate plasmid assembly (Kuijpers et al, 2013). How- 
ever, plasmid-borne gene expression is not favored for 
industrial-scale production as plasmids are notoriously 
unstable, and maintaining the selection pressure necessary 
for the cells to retain the plasmid is typically difficult to 
achieve in industrial settings (Zhang et al, 1996). Conversely,
chromosomal integration results in stable expression
of genes, but methods capable of rapid and accurate
assembly and integration of large constructs are currently 
not available. A recent pioneering study has demonstrated 
that S. cerevisiae can, in a single step, assemble multiple 
fragments into a 23.7-kb construct and integrate this construct
into the yeast chromosome (Shao et al, 2009). To
increase the probability of integration events, the abundant
δ-sites were chosen as integration loci. Although reasonable
efficiencies were obtained (up to 70% correct transformants),
targeting to δ-sites randomizes the number and
location of integration sites and therefore results in
unpredictable copy numbers and integration loci. For strain
engineering programs, where high integration efficiencies 
are required, control of the locus and the number of inte- 
gration events is of paramount importance. Considering 
the relatively low chromosomal integration efficiencies of 
linear DNA fragments in S. cerevisiae [c. 10⁻⁶ transformants
per viable cell using 50-bp flanks homologous to
the integration site (Storici et al, 2003)], it is not surprising
that targeting a single specific chromosomal site for
combined assembly and integration of a multigene frag- 
ment results in low efficiencies. The aim of the present 
study was to evaluate the potential of Combined in vivo 
Assembly and Targeted chromosomal Integration (from 
now on referred to as CATI) of large DNA constructs in 
S. cerevisiae. Preliminary results identified integration as 
the bottleneck for the CATI approach. Use of the meganuclease
I-SceI was therefore explored to create double-strand
DNA breaks and thereby enhance the integration
efficiency. I-SceI and its homolog I-SceII are native to
S. cerevisiae, in which they are encoded by mitochondrial
introns (Watabe et al, 1983; Shibata et al, 1984; Monteilhet
et al, 1990). These meganucleases, also named homing
endonucleases, are responsible for intron mobility in the
mitochondria of yeast, in which they initiate a site-specific
gene conversion (Plessis et al, 1992). Much like the well-studied
HO meganuclease, I-SceI introduces a double-strand
break at a specific recognition site. The recognition site of
I-SceI extends over an 18-bp nonsymmetrical sequence, and
cleavage generates a 4-bp overhang within this recognition
site (Monteilhet et al, 1990). In the early 1990s, it was
demonstrated that I-SceI, when expressed in the nucleus, was
active on nuclear targets and, as predicted from the
absence of I-SceI cutting sites in the genomic DNA, was
not toxic upon expression in wild-type S. cerevisiae (Plessis
et al, 1992). In the present work, I-SceI was implemented
and investigated to develop a robust system for
combined assembly and targeted chromosomal integration
of multigene constructs in S. cerevisiae.
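The statement that the genome lacks I-SceI cutting sites can be checked computationally for any sequence of interest. The sketch below is only an illustration and not part of the original work: it assumes the commonly cited 18-bp I-SceI recognition sequence TAGGGATAACAGGGTAAT (which also occurs in several primers of Table 2), and the function names and the toy sequence are hypothetical.

```python
# Minimal sketch: locate I-SceI recognition sites on both strands of a sequence.
# Assumption: the 18-bp site below is the commonly cited I-SceI recognition sequence.
ISCEI_SITE = "TAGGGATAACAGGGTAAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str = ISCEI_SITE) -> list:
    """Return sorted (0-based position, strand) pairs for every occurrence of site."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequence with one site on each strand.
    toy = "ACGT" + ISCEI_SITE + "GGCC" + reverse_complement(ISCEI_SITE) + "TT"
    print(find_sites(toy))
```

Applied to a chromosome sequence, an empty result would be consistent with the absence of native cutting sites, whereas the engineered integration locus described later would be expected to return its introduced sites.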

Materials and methods 
Strains and media 

The S. cerevisiae strains used in this study are derived from
the CEN.PK family (Table 1; Entian & Kötter, 2007; Nijkamp
et al, 2012). Cultures for transformation were grown
in complex medium containing 10 g L⁻¹ Bacto yeast
extract, 20 g L⁻¹ Bacto peptone, and 20 g L⁻¹ glucose as
carbon source. When galactose induction of SCEI was
required, cultures were transferred to synthetic medium
(SM) containing galactose as the sole carbon source and
grown for 4 h on that medium prior to transformation.
SM contained, per liter of demineralized water, 5 g
(NH4)2SO4, 3 g KH2PO4, 0.5 g MgSO4·7H2O, and trace
elements (Verduyn et al, 1992). Vitamins (Verduyn et al,
1992) were added after heat sterilization of the medium at
120 °C for 20 min. Glucose or galactose was separately
sterilized at 110 °C and added to a final concentration of
20 g L⁻¹. Where required, the medium was supplemented
with appropriate amounts of auxotrophic requirements
(Pronk, 2002). Solid medium was prepared by adding 2%
(w/v) agar to the media prior to heat sterilization. Selective
medium for the amdS marker was prepared as previously
described (Solis-Escalante et al, 2013).

Molecular biology techniques 

Table 1. Strains used in this study

Strain          Relevant genotype                                                                  Source
CEN.PK113-7D    MATa MAL2-8c SUC2                                                                  Nijkamp et al. (2012) and van Dijken et al. (2000)
CEN.PK113-5D    MATa ura3-52 MAL2-8c SUC2                                                          van Dijken et al. (2000)
CEN.PK102-3A    MATa ura3-52 leu2-3 MAL2-8c SUC2                                                   van Dijken et al. (2000)
IMX212          MATa ura3-52 leu2-3 MAL2-8c SUC2 spr3::(PGAL1-SCEI-TCYC1; KlURA3)                  This study
IMX221          MATa ura3-52 MAL2-8c SUC2 spr3::(TagG-KlURA3-PGAL1-SCEI-TCYC1-TagF)                This study
IMX222          MATa ura3-52 leu2-3 MAL2-8c SUC2 spr3::(TagG-KlURA3-PGAL1-SCEI-TCYC1-TagF)         This study
IMX224          MATa ura3-52 MAL2-8c SUC2 spr3::(TagG-amdSYM-TagF)                                 This study

PCR amplification was performed using Phusion® Hot
Start II High-Fidelity DNA Polymerase (Thermo Fisher
Scientific, Waltham, MA). To improve PCR efficiency,
the conditions recommended by the supplier were modified
by decreasing the primer concentration from 500 to 200 nM
and increasing the polymerase concentration from
0.02 to 0.03 U µL⁻¹. All other conditions followed the
manufacturer's instructions. Genomic template DNA was
isolated from S. cerevisiae CEN.PK113-7D using the
Qiagen 100/G kit (Qiagen, Hilden, Germany). Plasmids
maintained in E. coli DH5α were isolated with the
GenElute™ Plasmid Miniprep Kit (Sigma, St. Louis, MO).
DNA fragments were separated on 1% (w/v) agarose
(Sigma) gels in 1× TAE (40 mM Tris-acetate pH 8.0 and
1 mM EDTA), or on 2% (w/v) agarose gels in 0.5× TBE
(45 mM Tris-borate pH 8.0 and 1 mM EDTA) when fragments
were smaller than 500 bp. Fragments were isolated
from gel using the Zymoclean Gel DNA Recovery kit
(Zymo Research, Irvine, CA). The glycolytic gene fragments
for assembly were not gel-purified, but concentrated
directly after PCR amplification using Vivacon® 500
spin columns (Sartorius Stedim, Aubagne, France). DNA
concentrations were measured in a NanoDrop 2000
spectrophotometer (wavelength 260 nm; Thermo Fisher
Scientific). Genomic DNA of transformants was isolated
using the YeaStar™ Genomic DNA kit (Zymo Research).
Multiplex PCR was performed with primers (Table 2) at
a concentration of 150 nM. Cycling parameters were
94 °C for 3 min, then 35 cycles of 94 °C for 30 s, 60 °C
for 90 s, and 72 °C for 60 s, followed by a 10-min
incubation at 72 °C. Prior to transformation, fragments were
pooled, maintaining equimolar concentrations with the
marker fragment. Transformation to yeast was performed
with the LiAc/ssDNA method (Gietz & Woods, 2002).
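Pooling fragments at equimolar concentrations with the marker fragment means that longer fragments must be added in proportionally larger mass. The following sketch illustrates the arithmetic only; it assumes the usual approximation of 650 g/mol per base pair of double-stranded DNA, and the marker amount and fragment lengths in the usage example are hypothetical placeholders, not values from this study.

```python
# Minimal sketch of equimolar pooling arithmetic for double-stranded DNA fragments.
# Assumption: average mass of 650 g/mol per base pair (a common approximation).
AVG_BP_MASS = 650.0


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a dsDNA mass in ng to an amount in pmol."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def pmol_to_ng(amount_pmol: float, length_bp: int) -> float:
    """Convert a dsDNA amount in pmol to a mass in ng."""
    return amount_pmol * length_bp * AVG_BP_MASS / 1000.0


def equimolar_masses(marker_ng: float, marker_bp: int, fragment_lengths: dict) -> dict:
    """Mass (ng) of each fragment that matches the molar amount of the marker."""
    target_pmol = ng_to_pmol(marker_ng, marker_bp)
    return {name: pmol_to_ng(target_pmol, bp) for name, bp in fragment_lengths.items()}


if __name__ == "__main__":
    # Hypothetical marker amount and fragment lengths, for illustration only.
    lengths = {"fragment_1": 2000, "fragment_2": 3500}
    for name, mass in equimolar_masses(200.0, 1000, lengths).items():
        print(f"{name}: {mass:.0f} ng")
```

The required volume of each fragment then follows from dividing its mass by the concentration measured at 260 nm.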

Construction of a platform strain for
I-SceI-assisted integration

The plasmids used in this study are listed in Table 3.
Plasmid pUDC073 was obtained by cloning the SCEI
ORF into pAG416GAL-ccdB. The SCEI ORF was amplified
from BioBrick BBa_K175041 (http://parts.igem.org/Part:BBa_K175041)
with primers SCEI-Fw and SCEI-Rv.
The resulting fragment was digested with SpeI and XhoI
and ligated into pAG416GAL-ccdB, yielding pUDC073.

The integration site was constructed by integration of
the TagG-SCEI-KlURA3-TagF cassette into the yeast genome.
This cassette was constructed in multiple steps. First, SCEI
was amplified from pUDC073 with primers SCEI-Fw(2) and
SCEI-Rv(2), and KlURA3 was amplified from pUG72 with
primers URA-Fw and URA-Rv. The resulting cassettes were
gel-purified, and 100 ng of each cassette was used for
fusion-PCR (Shevchuk et al, 2004) using primers FUS1 and
FUS2. Cycling parameters were 98 °C for 1 min, then eight
cycles of 98 °C for 30 s, 58 °C for 30 s, and 72 °C for 120 s,
followed by 27 cycles of 98 °C for 30 s, 70 °C for 30 s, and
72 °C for 120 s, followed by a 10-min incubation at 72 °C.
The resulting product was integrated at the SPR3 locus of
CEN.PK102-3A, yielding the intermediate strain IMX212
(Fig. 1a1). Secondly, genomic DNA of IMX212 served as a
template for amplification of the SCEI/URA3 cassette with
primers SCEI+URA-Fw and SCEI+URA-Rv, resulting in
fragment X1 (Fig. 1a2). Thirdly, the flanking fragments X2
and X3 were obtained by PCR. Each carries three regions:
(1) a region homologous to the SPR3 locus, necessary for
integration of the final cassette; (2) the I-SceI recognition
site; and (3) one of the synthetic homologous recombination
sequences (SHR-sequences) F and G, required for integration
of cassettes during I-SceI-assisted integration. Fragment X2
(Fig. 1a2) was obtained by annealing oligonucleotides
TagG-REC-Fw and TagG-REC-Rv in a 50 µL PCR mix at a
concentration of 1 µM. This mix was subjected to 10 cycles
in a thermocycler using the following conditions: 98 °C for
30 s, 65 °C for 30 s, and 72 °C for 15 s, followed by a 10-min
incubation at 72 °C. Fragment X3 was obtained by the same
procedure using oligonucleotides TagF-REC-Fw and
TagF-REC-Rv (Fig. 1a2). Fourthly, the DNA fragments X1,
X2, and X3 were gel-purified and fused by fusion-PCR
using primers FUS1 and FUS2, with the same cycling
parameters as for the previous fusion-PCR (Fig. 1a2). The
resulting product was the TagG-SCEI-KlURA3-TagF cassette
(Fig. 1a3), which was gel-purified and transformed using
the LiAc/ssDNA method (Gietz & Woods, 2002) to
CEN.PK113-5D, leading to IMX221, and to CEN.PK102-3A,
leading to IMX222. In both strains, the regions containing
the I-SceI recognition sites and the SHR-sequences
F and G were PCR-amplified and sequenced using Sanger
sequencing (BaseClear, Leiden, the Netherlands).

Preparation of fragments for in vivo assembly 

Fragments for in vivo assembly were obtained by PCR
from either genomic or plasmid template DNA. The
amplified fragments were stored in TE buffer (10 mM
Tris, pH 8, 1 mM EDTA). Fragments amplified from plasmid
templates were subjected to gel extraction to prevent
false-positive transformants that might arise from
contamination with linearized template plasmid. Fragment
amdSYM AB carrying the counter-selectable amdS marker
behind the TDH3 promoter was amplified from pUDE158
(Table 3) in the experiment targeting the CAN1 locus
using primers Amds-GPD-Fw+A and Amds-GPD-Rv+B.
In the other experiments, an amdSYM AB cassette carrying
the amdS marker behind the AgTEF2 promoter was used,
to eliminate sequence homology between the marker
cassette and the yeast genome. This cassette was amplified
from pUGamdSYM with primers Amds-TEF-Fw+A and
Table 2. Oligonucleotide primers used in this study



Primer  Sequence 5' → 3'

To add SHR-sequences
TPI1 Rv+H  AGATTACTCTAACGCCTCAGCCATCATCGGTAATAGCTCGAATTGCTGAGAACCCGTGACTAGTGTGAGCGGGATTTAAACTGTG
TPI1 Fw+I  GCCTACGGTTCCCGAAGTATGCTGCTGATGTCTGGCTATACCTATCCGTCTACGTGAATAGCGAAAATGACGCTTGCAGTG
FBA1 Rv+H  GTCACGGGTTCTCAGCAATTCGAGCTATTACCGATGATGGCTGAGGCGTTAGAGTAATCTAAAATCTCAAAAATGTGTGGGTCATTACG
FBA1 Fw+G  GCCAGAGGTATAGACATAGCCAGACCTACCTAATTGGTGCATCAGGTGGTCATGGCCCTTAGTGCATGACAAAAGATGAGCTAGG
Amds-GPD Fw+A  ACTATATGTGAAGGCATGGCTATGGCACGGCAGACATTCCGCCAGATCATCAATAGGCACGCTGGAGCTCTTCGA
Amds-GPD Rv+B  GTTGAACATTCTTAGGCTGGTCGAATCATTTAGACACGGGCATCGTCCTCTCGAAAGGTGGGCCGCAAATTAAAGCCTTCGAG
Amds-TEF Fw+A  ACTATATGTGAAGGCATGGCTATGGCACGGCAGACATTCCGCCAGATCATCAATAGGCACGCGACATGGAGGCCCAGAATACC
Amds-TEF Rv+B  GTTGAACATTCTTAGGCTGGTCGAATCATTTAGACACGGGCATCGTCCTCTCGAAAGGTGAGTATAGCGACCAGCATTCACATACG
KlLEU2-Fw+A  ACTATATGTGAAGGCATGGCTATGGCACGGCAGACATTCCGCCAGATCATCAATAGGCACAGAGATCCGCAGGCTAACCG
KlLEU2-Rv+B  GTTGAACATTCTTAGGCTGGTCGAATCATTTAGACACGGGCATCGTCCTCTCGAAAGGTGGCTGTGAAGATCCCAGCAAAGG
PFK2 Rv+F  TGCCGAACTTTCCCTGTATGAAGCGATCTGACCAATCCTTTGCCGTAGTTTCAACGTATGATAGCCATTCTCTGCTGCTTTGTTG
PFK2 Fw+J  GGCCGTCATATACGCGAAGATGTCCAAGCAGGTAGAACACATAGTCTGAGCATCTCGTCGGAGATCCGAGGGACGTTTATTGG
PFK1 Rv+D  ACGCATCTACGACTGTGGGTCCCGTGGAGAAATGTATGAAACCCTGTATGGAGAGTGATTTCGAGATTCCTCAATCCATACACCATTATAG
PFK1 Fw+J  CGACGAGATGCTCAGACTATGTGTTCTACCTGCTTGGACATCTTCGCGTATATGACGGCCTGTCGTCTTCGTGAACCATTGTC
PGI1 Rv+D  AATCACTCTCCATACAGGGTTTCATACATTTCTCCACGGGACCCACAGTCGTAGATGCGTCTGAAGAAGGCATACTACGCCAAG
PGI1 Fw+C  ACGTCTCACGGATCGTATATGCCGTAGCGACAATCTAAGAACTATGCGAGGACACGCTAGTTCGCGACACAATAAAGTCTTCACG
HXK2 Rv+C  CTAGCGTGTCCTCGCATAGTTCTTAGATTGTCGCTACGGCATATACGATCCGTGAGACGTGCAAGAGAAAAAAACGAGCAATTGTTAAAAG
HXK2 Fw+B  CACCTTTCGAGAGGACGATGCCCGTGTCTAAATGATTCGACCAGCCTAAGAATGTTCAACGACGGCACCGGGAAATAAACC
PGK1 Rv+A  GTGCCTATTGATGATCTGGCGGAATGTCTGCCGTGCCATAGCCATGCCTTCACATATAGTCCTGCATTTAAAGATGCCGATTTGG
PGK1 Fw+I  GTAGACGGATAGGTATAGCCAGACATCAGCAGCATACTTCGGGAACCGTAGGCATTTTAGCGTAAAGGATGGGGAAAGAG

For the construction of pUDC073
SCEI-Fw  GCTGCCACTAGTATAATGCATCAAAAAAACCAGGTAATG
SCEI-Rv  TTATCACTCGAGTTATTACTTAAGGAAAGTTTCGGAGGAGATAG

For fusion-PCR of the I-SceI-URA3 cassette
URA3-Fw  GAGCCATCCATTCGTAATTCACTACTGCCTGAGGGTTGTTCTCAGAAGCTCATCGAACTGTCATC
URA3-Rv  CCATTCTGTAGCCACCTTATCCATGACCGTTTTATTAATTATTTCATAGCACTTGTAATTATATTACCCTGTTATCCCTAGCGAAGTGAGTGTTGCACCGTGCCAATG
SCEI-Fw(2)  GCTGCATCCTTCCCATGCAAAGTGTCTTCGTATTTAGTGATGTTTTGTTAGCGACACAAAGCTAGGGATAACAGGGTAATATGCAGTGAGCGCAACGCAATTAATG
SCEI-Rv(2)  AACAACCCTCAGGCAGTAGTGAATTACGAATGGATGGCTCCGACTCACTATAGGGCGAATTGG
FUS1  GCTGCATCCTTCCCATGCAAAGTG
FUS2  CCATTCTGTAGCCACCTTATCC
SCEI+URA-Fw  GCAGTGAGCGCAACGCAATTAATG
SCEI+URA-Rv  GAAGTGAGTGTTGCACCGTGCCAATG
TagF-REC-Fw  CCATTCTGTAGCCACCTTATCCATGACCGTTTTATTAATTATTTCATAGCACTTGTAATTTGCCGAACTTTCCCTGTATGAAGCGATCTGACCAATCCTTTGCCG
TagF-REC-Rv  CCTGCATTGGCACGGTGCAACACTCACTTCGCTAGGGATAACAGGGTAATATCATACGTTGAAACTACGGCAAAGGATTGGTCAGATCGCTTCATACAGGG
TagG-REC-Fw  GCTGCATCCTTCCCATGCAAAGTGTCTTCGTATTTAGTGATGTTTTGTTAGCGACACAAAGCCAGAGGTATAGACATAGCCAGACCTACCTAATTGGTGCATC
TagG-REC-Rv  GTAACTCACATTAATTGCGTTGCGCTCACTGCATATTACCCTGTTATCCCTAGCAAGGGCCATGACCACCTGATGCACCAATTAGGTAGGTCTGGCTATGTCTATACC

For construction of the fragments targeting the CAN1 locus
H1-Fw  TTCTAGGTTCGGGTGACGTGAAG

H1-Rv  AAGGGCCATGACCACCTGATGCACCAATTAGGTAGGTCTGGCTATGTCTATACCTCTGGCATTACCCTGTTATCCCTATTAATCACATTCCCACGCCATTTCG
Y1  CATACGTTGAAACTACGGCAAAGGATTGGTCAGATCGCTTCATACAGGGAAAGTTCGGCATAGGGATAACAGGGTAATGCTCATTGATCCCTTAAACTTTCTTTTCGGTGTATGAC
Y2  CCAGTTTTCAATCTGTCGTCAATCGAAAGTTTATTTTAATCACATTCCCACGCCATTTCGCATTCTCACCCTCATAAGTCATACACCGAAAAGAAAGTTTAAGGGATCAATGAGC
H2-Fw  AATAAACTTTCGATTGACGACAGATTG
H2-Rv  GTTTCCGGGTGAGTCATACG
FUS3  CATACGTTGAAACTACGGCAAAGG

For analytical PCR: glycolytic genes integrated in the CAN1 locus
G-fw  CCGTCATCGGAGTCGTTATCAG
G-rv  GCTCTTTTCTTCTGAAGGTCAATG
F-fw  GACGCCATTTGGAACGAAAAAAAG
F-rv  TAACGGCAAACAGCAAAGGC
H-fw  GTTACGTGCTCAGTTGTTAGATATG
H-rv  GCAGAAGTGTCTGAATGTATTAAGG
I-fw  TGAGCCACTTAAATTTCGTGAATG
I-rv  TTTCTCTTTCCCCATCCTTTACG
A-fw  AAGGATTCGCGCCCAAATCG
A-rv  CTTCCCAAGATTGTGGCATGTC
B-fw  TGGCTATCGCTGAAGAAGTTGG
B-rv  ACGGAATAGAACACGATATTTGC
C-fw  TCACGGGATTTATTCGTGACG
C-rv  CCCACGATGCTTCTACCAAC
D-fw  ACTCGCCTCTAACCCCACG
D-rv  AATCATGTTGATGACGACAATGG
J-fw  GCTTAATCTGCGTTGACAATGG
J-rv  CAATAAACGTCCCTCGGATCTC

For multiplex PCR: glycolytic genes integrated in the SPR3 locus
G-fw  CTTGGCTCTGGATCCGTTATCTG
G-rv  GCTCTTTTCTTCTGAAGGTCAATG
F-fw  GACGCCATTTGGAACGAAAAAAAG
F-rv  TTGGGCTGGACGTTCCGACATAG
H-fw  GTTACGTGCTCAGTTGTTAGATATG
H-rv  GCAGAAGTGTCTGAATGTATTAAGG
I-fw  TGAGCCACTTAAATTTCGTGAATG
I-rv  TTTCTCTTTCCCCATCCTTTACG
A-fw  AAGGATTCGCGCCCAAATCG
A-rv  CTTCCCAAGATTGTGGCATGTC
B-fw  TGGCTATCGCTGAAGAAGTTGG
B-rv  ACGGAATAGAACACGATATTTGC
C-fw  TCACGGGATTTATTCGTGACG
C-rv  CCCACGATGCTTCTACCAAC
D-fw  ACTCGCCTCTAACCCCACG
D-rv  AATCATGTTGATGACGACAATGG
J-fw  GCTTAATCTGCGTTGACAATGG
J-rv  CAATAAACGTCCCTCGGATCTC



Amds-TEF-Rv+B. The marker fragment KlLEU2 AB was
obtained from pUG73 with primers KlLEU2-Fw+A and
KlLEU2-Rv+B. The fragments containing the glycolytic
genes were all amplified from CEN.PK113-7D genomic
DNA (Nijkamp et al, 2012) using the primers with the
corresponding glycolytic gene names listed in Table 2.
Fragments H1 and H2, which are homologous to the CAN1 locus
and were used for targeted integration of in vivo-assembled
constructs at that locus, were obtained by PCR amplification
from CEN.PK113-7D genomic DNA. H1 was amplified
with primers H1-Fw and H1-Rv. Fragment H2 was
obtained in two steps (Fig. 1b). First, fragment Z1 was
obtained by fusion of oligos Y1 and Y2 in the same way
as described for fragment X2 (Fig. 1b2). Fragment Z2



FEMS Yeast Res 13 (2013) 769-781 



© 2013 The Authors. FEMS Yeast Research 
published by John Wiley & Sons Ltd on behalf of Federation of European Microbiological Societies 






N.G.A. Kuijpers et al.



Table 3. Plasmids used in this study

Plasmid           Characteristic                                                       Source
pUG72             PCR template for Kluyveromyces lactis URA3 (KlURA3)                  Gueldener et al. (2002)
pUG73             PCR template for Kluyveromyces lactis LEU2 (KlLEU2)                  Gueldener et al. (2002)
pUGamdSYM         PCR template for amdSYM under the control of the AgTEF2 promoter     Solis-Escalante et al. (2013)
pUDE158           PCR template for amdSYM under the control of the TDH3 promoter       Solis-Escalante et al. (2013)
pAG416GAL-ccdB    CEN6/ARS4 ori, URA3, PGAL1-ccdB-TCYC1                                Alberti et al. (2007)
pUDC073           CEN6/ARS4 ori, URA3, PGAL1-SCEI-TCYC1                                This study



was obtained by PCR on genomic DNA using primers
H2-Fw and H2-Rv (Fig. 1b1). Fragments Z1 and Z2 were
gel-purified and fused by fusion-PCR, using the same
method as described before, with primers H2-Rv and
FUS3, resulting in fragment H2 (Fig. 1b3). Fragments H1
and H2 were gel-purified before addition to the
transformation mix.

Results 

Poor efficiency of simultaneous assembly and 
targeted integration of seven glycolytic genes 
into the CAN1 locus 

We demonstrated previously that S. cerevisiae can assem- 
ble nine fragments very efficiently and with high fidelity 
into a 21-kb plasmid carrying six glycolytic genes 
(Kuijpers et al, 2013). To test whether combined assembly 
and targeted integration at a specific locus of a multiple 
gene construct could be achieved, the six previously 
designed glycolytic gene fragments (Kuijpers et al, 2013) 
and one additional glycolytic gene fragment were used to 
assemble and integrate a total of seven glycolytic genes in a 
single step (Fig. 2). A set of ten fragments was obtained by 
adding two flanking fragments designed to target the 
CAN1 locus (Whelan et al, 1979) and one marker fragment
carrying the amdS dominant marker (Fig. 2;
Solis-Escalante et al, 2013) to the seven glycolytic gene
cassettes. All fragments were designed to overlap by 60-bp
synthetic homologous recombination sequences (from 
now on referred to as SHR-sequences) that do not share 
homology with the yeast genome (Kuijpers et al, 2013). 
After transformation with these ten fragments, yeast cells 
were grown on glucose synthetic medium. To identify 
transformants in which genomic integration of amdS had 
occurred, acetamide was used as the sole nitrogen source. 
Thirty-five transformants were obtained, of which 20 were 
picked and plated on medium containing L-canavanine, to 
select for integration of the assembled construct in the 
CAN1 locus. Only two of the 20 transformants were able 
to grow in the presence of L-canavanine, indicating that 
CAN1 was disrupted in only 10% of the tested transformants.
Of these two L-canavanine-resistant transformants,
only one was correctly assembled and carried all
transformed fragments (Fig. 3). Although simultaneous
assembly and targeted integration of a multigene construct
in a single chromosomal locus was achieved, it was extremely
inefficient (one of the 20 tested transformants). As
we previously established efficiencies of plasmid assembly
of 95% with the same overlapping sequences (Kuijpers
et al, 2013), these results pointed to the integration step as
the main bottleneck for the CATI approach.
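The design principle of the ten-fragment set, in which each fragment shares one SHR-sequence with each of its neighbors, can be illustrated by chaining fragments through their shared tags. The sketch below is a hypothetical illustration rather than a tool used in this study: the tag pairs are read from the primer names in Table 2 and the layout of Fig. 2, and the labels "CAN1-up" and "CAN1-down" for the genomic homology of H1 and H2 are assumptions introduced here.

```python
# Minimal sketch: derive the assembly order of fragments from shared end tags.
# Tag pairs follow the primer names in Table 2 and the layout of Fig. 2;
# "CAN1-up" and "CAN1-down" are hypothetical labels for the genomic flanks.
FRAGMENTS = {
    "PFK2": ("J", "F"),
    "H1": ("CAN1-up", "G"),
    "amdS": ("A", "B"),
    "FBA1": ("G", "H"),
    "PGI1": ("C", "D"),
    "TPI1": ("H", "I"),
    "H2": ("F", "CAN1-down"),
    "PGK1": ("I", "A"),
    "HXK2": ("B", "C"),
    "PFK1": ("D", "J"),
}


def assembly_order(fragments: dict, start_tag: str, end_tag: str) -> list:
    """Walk from start_tag to end_tag, joining fragments that share a tag."""
    by_left = {}
    for name, (left, _right) in fragments.items():
        if left in by_left:
            raise ValueError(f"tag {left!r} starts more than one fragment")
        by_left[left] = name
    order, tag = [], start_tag
    while tag != end_tag:
        if tag not in by_left:
            raise ValueError(f"no fragment continues from tag {tag!r}")
        name = by_left.pop(tag)
        order.append(name)
        tag = fragments[name][1]
    if by_left:
        raise ValueError(f"unused fragments: {sorted(by_left.values())}")
    return order


if __name__ == "__main__":
    print(" - ".join(assembly_order(FRAGMENTS, "CAN1-up", "CAN1-down")))
```

Removing one fragment from the set breaks the chain and raises an error, which mirrors the expectation that an incomplete fragment set cannot be joined by homologous recombination alone.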

Substantial improvement of integration
efficiency of an amdS marker cassette into the
SPR3 locus using I-SceI-induced double-strand
DNA breaks

The cellular function of homologous recombination is to
repair double-strand DNA breaks (DSBs). While in vivo
assembly supplies DNA fragments with open ends, readily
accessible for the homologous recombination machinery,
chromosomal integration of these fragments requires
recombination of DNA fragments with intact genomic
DNA and is therefore far less likely to occur (Orr-Weaver
et al, 1981; Leem et al, 2003). A way to enhance the
efficiency of integration would therefore be to artificially
introduce a DSB at the integration site. Rare-cutting
endonucleases can be used to introduce DSBs, thereby
drastically increasing the efficiency of integration by
homologous repair (Storici et al, 2003; Wingler &
Cornish, 2011). The well-studied I-SceI meganuclease,
originally encoded by the S. cerevisiae mitochondrial gene
SCEI, has an 18-bp unique recognition sequence and has
previously been functionally expressed in the nucleus of
S. cerevisiae (Plessis et al, 1992). To investigate whether
introduction of DSBs might eliminate or alleviate the
bottleneck in chromosomal integration, a platform was
engineered to combine in vivo assembly with I-SceI-facilitated
integration. IMX221 was constructed by integration of a
cassette containing SCEI under the control of the inducible
GAL1 promoter and a uracil marker in the SPR3
locus of the uracil auxotroph S. cerevisiae CEN.PK113-5D
(Fig. 4a). The resulting locus carried two 22-bp I-SceI
recognition sequences flanked by 60-bp SHR-sequences G
and F (Fig. 4a). A single cassette, carrying the amdS marker
and SHR-sequences G and F at its 5' and 3' ends,
respectively, was constructed to integrate at this synthetic
locus (Fig. 4b). This cassette was transformed to IMX221
preincubated in galactose medium to induce expression
of SCEI. To quantify the effect of the I-SceI-induced
DSB, IMX221 cells not induced on galactose were
transformed with the same amdSYM cassette and used as
a negative control. Previous reports indicated that incubation
of strains containing SCEI under control of the
GAL1 promoter in the presence of galactose resulted in
induction of DSBs within several hours (Storici et al,
2003). In the present study, SCEI was induced by growing
the yeast cells for four hours in galactose medium
prior to transformation. While transformation of
I-SceI-expressing cells resulted in c. 10⁴ transformants, the
negative control yielded only 15 transformants. PCR
analysis of three colonies of each of the I-SceI-expressing
and nonexpressing transformants revealed that they all
contained the amdSYM cassette, correctly integrated at the
SPR3 locus. One correct clone resulting from the
transformation of induced cells was named IMX224. Sequencing
of the SPR3 locus of IMX224 showed that the region
between the SHR-sequences G and F was successfully
replaced by the amdSYM cassette without leaving any scar
of the I-SceI recognition sequences. These results demonstrate
that induction of a DSB is a critical step for integration
of DNA fragments in yeast chromosomes and
suggest that I-SceI-assisted integration should improve
the efficiency of CATI.

Fig. 1. Construction of the TagG-SCEI-KlURA3-TagF and the H2 cassette. (a) First, the SCEI/URA3 cassette was obtained by PCR on genomic
DNA of IMX212 with primers SCEI+URA-Fw and SCEI+URA-Rv, resulting in fragment X1 (a1). Fragment X2 was obtained by fusing oligos
TagG-REC-Fw and TagG-REC-Rv in an independent PCR. Fragment X3 was obtained in the same way, using oligos TagF-REC-Fw and
TagF-REC-Rv (a2). Fragments X1, X2, and X3 were fused in a fusion-PCR with primers FUS1 and FUS2, resulting in the TagG-SCEI-KlURA3-TagF cassette
(a3). (b) Fragment Z2 was obtained by PCR on genomic DNA of CEN.PK113-7D with primers H2-Fw and H2-Rv (b1). Fragment Z1 was obtained
by fusing oligos Y1 and Y2 in a PCR (b2). Fragments Z1 and Z2 were fused in a fusion-PCR with oligos FUS3 and H2-Rv, resulting in fragment
H2, which contains SHR-sequence F and 300 bp homology to the CAN1 locus (b3).

[Figure 2 schematic: CAN1 flank, FBA1, TPI1, PGK1, amdS, HXK2, PGI1, PFK1, PFK2, CAN1 flank; junction SHR-sequences in order G, H, I, A, B, C, D, J, F]

Fig. 2. Combined assembly and integration of seven glycolytic genes in the CAN1 locus of Saccharomyces cerevisiae. Ten overlapping DNA
fragments, containing seven glycolytic genes, the amdS selection marker, and the two flanking fragments H1 and H2, carrying 300-bp sequences
homologous to the CAN1 integration locus, were cotransformed to S. cerevisiae and assembled in yeast via homologous recombination into a
single large integration cassette. 60-bp SHR-sequences were used to promote in vivo assembly of the fragments.

Fig. 3. PCR analysis of a positive transformant after cotransformation of ten overlapping fragments to Saccharomyces cerevisiae. The PCRs were
designed to produce amplicons covering the indicated junctions. PCR products covering junctions H, C, D, J, and I were separated on a 2%
agarose gel, and PCR products covering junctions A, F, G, and B were separated on a 1% agarose gel by electrophoresis. In the lane labeled 'L50,'
a 50-bp Gene Ruler ladder was loaded; in the lane labeled 'Lmix,' a Gene Ruler Mix ladder was loaded; sizes are indicated. All amplicons
matched the expected size, thereby indicating correct assembly and integration of seven glycolytic genes in the CAN1 locus.

I-SceI-assisted integration of seven glycolytic
genes at a synthetic chromosomal locus

To test whether I-Scel-assisted integration could be com- 
bined with in vivo assembly of multiple genes, the same 
seven overlapping glycolytic gene cassettes used in the 
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Fig. 4. Design of the l-Scel-facilitated CATI method, (a) First, the platform strain was obtained by introducing a cassette containing SCEI and 
KIURA3 flanked by three regions, the l-Scel recognition site, synthetic recombination sequences G and F, and flanking regions homologous to the 
targeted locus SPR3. (b) Induction of plasmid-borne SCEI in the platform strain prior to transformation, causing excision of the SCEI/URA3 
fragment and leaving the 60-bp SHR-sequences F and G exposed for recombination. Transformation of the induced yeast cells with the amdSYM 
cassette flanked by SHR-sequences G and F led to integration of the cassette at the l-Scel-restricted locus, (c) Integration of multiple overlapping 
fragments, using the same integration approach described in (b), leading to l-Scel-assisted integration of seven glycolytic genes and a KILEU2 
marker cassette into the SPR3 locus. 



first experiment and the amdSYM marker cassette were 
cotransformed to I-Scel-expressing cells of the platform 
strain IMX221 (Fig. 4c). Two control sets of fragments 



were also tested. An incomplete set of cassettes (lacking 
HXK2 BC ) was used to estimate the occurrence of nonho- 
mologous recombination events within the construct. 
Secondly, a single cassette carrying the selection marker 
but without homology to the integration site was used to 
evaluate the possible integration of nonhomologous 
fragments. Transformation of I-SceI-expressing cells with 
these two control sets of fragments did not yield 
transformants. Conversely, transformation of I-SceI-expressing 
cells with the complete set of fragments resulted in 336 
transformants capable of using acetamide as sole nitrogen 
source. Analysis by multiplex PCR of ten randomly 
picked clones demonstrated the integration of a full set of 
fragments in the SPR3 locus of S. cerevisiae for nine 
clones (Fig. 5a). To evaluate the robustness of the CATI 
approach, the same experiment was repeated by replacing 
the dominant amdS marker with the widely used auxotrophic 
selection marker LEU2. A new platform strain was 
constructed by introducing the same synthetic locus used 
in IMX221 in the leucine auxotrophic strain S. cerevisiae 
CEN.PK102-3A, resulting in strain IMX222. Subsequently, 
the above-described cassettes carrying the seven glycolytic 
genes were cotransformed to IMX222 together with the 
Kluyveromyces lactis LEU2 orthologous marker cassette. 
The KlLEU2-based CATI resulted in 470 clones, of which 
ten clones were analyzed by multiplex PCR. These ten 
clones harbored all expected amplicons, indicating the 
correct integration of all eight fragments at the targeted 
locus (Fig. 5b). These results demonstrate the high 
efficiency (c. 95%) and robustness of I-SceI-assisted 
simultaneous assembly and chromosomal integration of 
eight overlapping DNA cassettes, comprising a 22-kb 
construct, in S. cerevisiae. 
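
The efficiency of c. 95% follows from pooling the two PCR screens reported above (nine correct clones out of ten with amdS, ten out of ten with KlLEU2). The short Python sketch below reproduces that arithmetic. The Wilson score interval it also prints is an illustrative addition, not part of the reported analysis; it only indicates how much uncertainty a sample of 20 clones carries. The names `screens` and `wilson_interval` are hypothetical.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Clone counts taken from the text: (correct clones, clones analyzed)
screens = {"amdS": (9, 10), "KlLEU2": (10, 10)}

correct = sum(c for c, _ in screens.values())
analyzed = sum(n for _, n in screens.values())
efficiency = correct / analyzed
low, high = wilson_interval(correct, analyzed)

print(f"{correct}/{analyzed} correct clones = {efficiency:.0%}")
print(f"approximate 95% interval: {low:.0%} to {high:.0%}")
```

The pooled estimate rests on 20 analyzed clones out of roughly 800 transformants, so the interval is wide; it should be read as a rough indication only.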

Fig. 5. Characterization by multiplex PCR of positive clones isolated after I-SceI-assisted CATI of ten fragments. (a) PCR patterns of ten clones 
resulting from cotransformation of the glycolytic genes with the amdS selection marker. (b) PCR patterns of ten clones obtained by replacing 
amdS by the KlLEU2 selection marker in the cotransformation with the glycolytic genes. Transformants were randomly picked and analyzed by 
multiplex PCR producing amplicons covering the indicated junctions. PCR products were separated on a 2% agarose gel by electrophoresis. In 
lanes labeled 'L', a 50-bp Gene Ruler ladder was loaded; sizes are indicated. From these 20 tested clones, a single one [(a), transformant number 
10, amplicon C] did not display the expected pattern. 

Discussion 

The current demand from the biotechnology industry for 
strains able to express complex synthetic pathways with 
ever increasing productivity and robustness requires the 
construction of strains simultaneously expressing dozens 
of homologous and heterologous genes (Ro et al, 2006; 
Koopman et al, 2012). Genetic tools enabling the fast 
and efficient construction of large chromosome-borne 
synthetic constructs are therefore urgently needed. 
Comparatively little attention has been given to the 
development of such methods to date. One study 
explored the possibilities of combining recombination-based 
cloning with chromosomal integration in S. cerevisiae 
(Shao et al, 2009) and provided a clear proof of 
principle. Eight genes were assembled in vivo into a 
23.7-kb construct and successfully integrated into a δ-site. 
Using this approach, variable numbers of clones (30-50 
clones) and efficiencies (10% to 70%) were obtained 
depending on the length of the overlapping sequences 
used, the lowest efficiency being obtained using short 
overlaps of 50 bp. In the present study, implementation 
of I-SceI-assisted combined in vivo assembly and targeted 
chromosomal integration in S. cerevisiae led to consistently 
high efficiencies of c. 95% for a construct of a similar 
size using 60-bp overlapping sequences and a similar 
number of fragments. Furthermore, while transformation 
to δ-sites inherently randomizes the location and number 
of integration sites, the present work is, to the best of our 
knowledge, the first published report of targeted integration 
of an in vivo-assembled multigene construct in S. cerevisiae. 
The SHR-sequences used in the presented 
platform for the assembly of the fragments add versatility 
to the system, which allows for parallel construction of 
replicative and integrative constructs. These achievements 
represent a new step toward reliable and robust 
high-throughput strain construction. We are currently using 
the I-SceI-assisted CATI approach to assemble and integrate 
complete pathways up to 35 kb from 15 fragments, 
and the efficiency of correct assembly is similar to the 
efficiencies in this study. While the number of transformants 
seems to decrease with the number of assembled 
fragments (c. 400 clones with nine fragments and c. 200 
clones with 15 fragments), the number of transformants 
obtained is sufficiently high to be compatible with 
high-throughput strain construction programs. 
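
The way the SHR-sequences fix the order of the assembled fragments can be sketched as a simple chaining check: each cassette is described by the SHR label at its left and right end, and a set of cassettes can only bridge the exposed chromosomal ends (G and F in Fig. 4) if the labels form an unbroken chain. The Python sketch below is an illustration under stated assumptions, not the design procedure used in this study. Only the G and F ends of the locus and the B and C flanks of the HXK2 cassette are taken from the text; all other cassette names and SHR labels, and the function `assembly_order`, are hypothetical.

```python
def assembly_order(cassettes, start="G", end="F"):
    """Order cassettes so that each right SHR label matches the next left label.

    cassettes maps a cassette name to its (left SHR, right SHR) labels.
    Raises ValueError if the chain from start to end is broken or incomplete.
    """
    by_left = {}
    for name, (left, right) in cassettes.items():
        if left in by_left:
            raise ValueError(f"two cassettes share left SHR {left!r}")
        by_left[left] = (name, right)

    order, label = [], start
    while label != end:
        if label not in by_left:
            raise ValueError(f"no cassette with left SHR {label!r}: chain broken")
        name, label = by_left.pop(label)
        order.append(name)

    if by_left:
        unused = sorted(name for name, _ in by_left.values())
        raise ValueError(f"cassettes not part of the chain: {unused}")
    return order


# Hypothetical four-cassette set; only HXK2 (B, C) and the G/F ends come from the text.
complete = {
    "HXK2": ("B", "C"),
    "marker": ("C", "F"),
    "gene_1": ("G", "A"),
    "gene_2": ("A", "B"),
}
print(assembly_order(complete))

# Leaving out HXK2, as in the incomplete control set, breaks the chain at B.
incomplete = {k: v for k, v in complete.items() if k != "HXK2"}
try:
    assembly_order(incomplete)
except ValueError as err:
    print("incomplete set:", err)
```

The check is purely combinatorial. It says nothing about recombination efficiency, but it mirrors the logic of the control experiment in which an incomplete set of cassettes yielded no transformants.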

The present study indicates that the critical step in 
chromosomal integration and the key to the high efficiencies 
obtained for combined assembly and integration of a 
multifragment pathway is the introduction of a double-strand 
DNA break at the targeted integration locus. 
Because of its high specificity and resulting lack of toxicity 
in most tested organisms, I-SceI has been employed in 
many systems to induce site-specific double-strand breaks, 
such as induction of homologous recombination in 
higher eukaryotes (Rouet et al, 1994; Choulika et al, 
1995), seamless gene modifications in yeast (Noskov 
et al, 2010; Khmelinskii et al, 2011), and sequential 
pathway engineering in yeast (Wingler & Cornish, 2011). 
In recent years, several approaches have been explored to 
engineer synthetic endonucleases for any recognition 
sequence of interest: the zinc-finger nucleases (ZFNs; 
Kandavelou et al, 2005), the TAL effector nucleases 
(TALENs; Christian et al, 2010), and the recent 
RNA-guided CRISPR/Cas nucleases (Cong et al, 2013). 
Those synthetic 'genomic scissors' could greatly contribute 
to further development of the CATI method for chromosomal 
modifications. Screening of those synthetic endonucleases 
for site-specific nuclease activity in S. cerevisiae 
might reveal even more efficient DSB inducers, thereby 
further improving the targeting efficiency for integration. 
Furthermore, design of synthetic meganucleases for specific 
recognition sequences already present in the yeast genome 
could abolish the need for a defined synthetic locus to 
target for integration. While in the present approach the 
meganuclease was expressed by the host organism, it has 
been previously shown that endonucleases can enter yeast 
cells and reach the nucleus during transformation (Schiestl 
& Petes, 1991). Cotransformation of cells with the desired 
fragments and custom-made meganucleases would make 
the presented method applicable to any strain without any 
prior modification of the genome. We therefore anticipate 
that the coming years will see further increases in the 
flexibility and ease of endonuclease-assisted CATI for strain 
engineering. 

While S. cerevisiae is known for its high efficiency of 
homologous recombination (Schiestl & Petes, 1991; 
Gibson et al, 2008), the present work demonstrates that 
the efficiency of integration of DNA fragments in its 
genome can be substantially increased by introduction of 
a DSB. Microorganisms in which it is notoriously difficult 
to achieve highly efficient targeted integration of DNA 
fragments, such as the yeast Kluyveromyces lactis and 
filamentous fungi (Kooistra et al, 2004; Ninomiya et al, 
2004; Krappmann et al, 2006; Snoek et al, 2009), may 
similarly benefit from endonuclease-assisted integration. 
Although the presented method has been engineered for 
S. cerevisiae, the principle could be applied to any 
organism with an efficient homologous repair mechanism. 
Alternatively, S. cerevisiae and its outstanding 
recombination efficiency could be exploited to assemble 
the DNA constructs in vivo prior to transformation to the 
final host. 

The presented I-SceI-assisted CATI approach has drastically 
changed the strain engineering procedures in our 
laboratory and opened new possibilities for large-scale 
metabolic engineering of S. cerevisiae. It is our hope that 
this work will further contribute to the development of 
S. cerevisiae as a valuable platform for the production of 
many industrially relevant compounds. 